
ci: trigger eval-skills on every PR so required checks always report #79

Merged
tiffanylphan merged 1 commit into main from fix/eval-skills-trigger-on-all-prs on Jun 15, 2026

@tiffanylphan tiffanylphan commented Jun 15, 2026

Contributor

Summary

The required checks come from a path-filtered workflow, so PRs that don't touch skills/** or evals/** never trigger it and get stuck "waiting for status to report." This PR removes the path filter so the workflow fires on every PR. Nothing is weakened: the diff job inside the workflow already gates the actual eval work on whether skills changed, and skipped required checks count as passing in branch protection. Net cost: ~40s of cheap CI per PR; no extra API tokens.


Branch protection on main requires three status checks — Unit tests, Aggregate scores, and Evaluate gate — all of which are produced by .github/workflows/eval-skills.yml. The workflow is currently path-filtered:

on:
  pull_request:
    paths:
      - "skills/**"
      - "evals/**"
      - ".github/workflows/eval-skills.yml"

For any PR that doesn't touch one of those paths, the workflow never triggers, so the three required checks stay forever in Expected — Waiting for status to be reported and the PR is BLOCKED from merging even with every other check green and reviews approved.

#74 hit this — it only changed .claude-plugin/marketplace.json and README.md, so the workflow never fired and the merge button has been stuck for days.

Fix

Remove the trigger-level paths: filter so the workflow runs on every PR. The existing gating is unchanged and the added cost is small:

  • The diff job is preserved — it still computes whether any skills actually changed.
  • evaluate and aggregate already have job-level if: needs.diff.outputs.has_changes == 'true' guards, so they skip cleanly on PRs that don't touch skills.
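  • GitHub reports a required check as passing when its job is skipped by a job-level if: condition, so non-skill PRs still satisfy branch protection without running any actual evals. A workflow skipped by a trigger-level paths: filter reports nothing at all, which is why the checks previously stayed pending.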
  • The only always-on cost added per PR is the unit-test job (~30s) and the diff job (~10s).
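
The gating described in the bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of eval-skills.yml: the job ids diff and evaluate and the has_changes output come from the PR description, while the step id detect, the git diff command, the skills path argument, and the placeholder eval step are hypothetical. It covers only the pull_request event.

```yaml
jobs:
  diff:
    runs-on: ubuntu-latest
    outputs:
      # Exposed to downstream jobs as needs.diff.outputs.has_changes
      has_changes: ${{ steps.detect.outputs.has_changes }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history so the PR base commit is available locally
      - id: detect         # hypothetical step id
        env:
          BASE_SHA: ${{ github.event.pull_request.base.sha }}
        run: |
          # Assumption: "skills changed" means any file under skills/ differs from the PR base.
          # git diff --quiet exits 0 when there is no difference.
          if git diff --quiet "$BASE_SHA" HEAD -- skills; then
            echo "has_changes=false" >> "$GITHUB_OUTPUT"
          else
            echo "has_changes=true" >> "$GITHUB_OUTPUT"
          fi

  evaluate:
    needs: diff
    # Job-level guard quoted in the PR description: the job is skipped, not failed,
    # when no skills changed.
    if: needs.diff.outputs.has_changes == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "run the eval suites here"   # placeholder for the real eval step
```

Because the guard sits on the job rather than on the trigger, the workflow always starts and each job reports a status, which is what branch protection needs to see.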

The cron-schedule and workflow_dispatch triggers are unchanged.
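
For reference, the trigger block after this change would look roughly like the sketch below. The pull_request entry without paths: reflects the fix described here; the cron expression is a placeholder, since the actual schedule is not shown in this PR.

```yaml
on:
  pull_request:            # no paths: filter, so every PR triggers the workflow
  schedule:
    - cron: "0 6 * * *"    # placeholder value; the real schedule is unchanged and not shown here
  workflow_dispatch:       # manual runs, unchanged
```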

Test plan

  • Merge this PR (it touches .github/workflows/eval-skills.yml, so the workflow self-triggers under the old paths filter — required checks will report on this PR via the existing rule).
  • Confirm #74 unblocks once this lands — push an empty/no-op commit there so the new trigger fires, and verify all three required checks report SUCCESS or SKIPPED.
  • Open a follow-up PR touching only a non-skills file (e.g. README.md) and confirm the three required checks report there too.

🤖 Generated with Claude Code

Branch protection requires "Unit tests", "Aggregate scores", and
"Evaluate gate" — all of which come from this workflow. With the
`paths:` filter, PRs that don't touch `skills/**`, `evals/**`, or this
file never trigger the workflow, so the three required checks stay
forever in "Expected — Waiting for status to be reported" and block
merge. (#74 hit this.)

Remove the trigger-level `paths:` filter so the workflow always runs on
PRs. The existing diff job and per-job `if:` conditions already
short-circuit the real work when no skills changed, and GitHub treats
skipped required checks as passing.
@tiffanylphan tiffanylphan requested a review from a team as a code owner June 15, 2026 22:48
@github-actions


Skill eval results

| Skill | Before | After | Δ |
| --- | --- | --- | --- |
| agentcontrol/configs-create | 100/100 (4/4) | 75/100 (3/4) | -25 |
| agentcontrol/configs-update | 80/100 (4/5) | 80/100 (4/5) | no change |
| agentcontrol/configs-variations | 80/100 (4/5) | 80/100 (4/5) | no change |
| agentcontrol/tools | 75/100 (3/4) | 75/100 (3/4) | no change |
| feature-flags/launchdarkly-flag-command | - | 100/100 (3/3) | new |
| feature-flags/launchdarkly-flag-create | 100/100 (3/3) | 100/100 (3/3) | no change |

Only suites whose source actually changed since their last recorded score were re-run. Soft-failing while we stabilise the baseline.

@tiffanylphan tiffanylphan merged commit 14bde71 into main Jun 15, 2026
13 checks passed
@tiffanylphan tiffanylphan deleted the fix/eval-skills-trigger-on-all-prs branch June 15, 2026 23:24
tiffanylphan added a commit that referenced this pull request Jun 15, 2026
